counterfactual explanation and strategic behavior
Review for NeurIPS paper: Decisions, Counterfactual Explanations and Strategic Behavior
Weaknesses: The paper's biggest omission is that it only considers decision-maker utility as opposed to social welfare/decision subjects' utility. This is significant because the model and techniques proposed are inherently extractive in the following sense: the decision-maker can and will induce the subject to pay a cost of (say) .5 in order to improve the decision-maker's utility by .01. As noted in the paper, the hope is that the improvement is worth it to both the decision-maker and the subject, but there's no guarantee that this will actually be the case. I think the experiments should at least investigate this question: does social welfare ultimately increase? Are there individuals whose utility decreases compared to the non-strategic setting?
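To make the reviewer's arithmetic concrete, here is a minimal sketch of the welfare check the review asks for. The function name `welfare_change` and the `subject_gain` parameter are hypothetical and do not come from the paper. The numbers are the reviewer's illustrative ones (a cost of .5 to the subject, a gain of .01 to the decision-maker), and the subject's own gain is set to zero purely as a worst-case assumption.

```python
def welfare_change(dm_gain: float, subject_gain: float, subject_cost: float):
    """Return (change in social welfare, change in the subject's net utility).

    Social welfare is taken here to be the plain sum of the decision-maker's
    utility and the subject's net utility. This is an illustrative assumption,
    not a definition from the paper.
    """
    subject_net = subject_gain - subject_cost
    return dm_gain + subject_net, subject_net


# Reviewer's illustrative numbers; subject_gain=0.0 is an assumed worst case.
total, subject_net = welfare_change(dm_gain=0.01, subject_gain=0.0, subject_cost=0.5)
print(f"social welfare change: {total:+.2f}, subject net utility: {subject_net:+.2f}")
```

Under these assumed numbers the decision-maker gains slightly while both the subject and the total lose, which is the case the reviewer wants the experiments to rule out or quantify.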
Review for NeurIPS paper: Decisions, Counterfactual Explanations and Strategic Behavior
This paper proposes and analyzes a model of strategic behavior under counterfactual explanations. In this model, a decision-maker chooses a policy and a small set of explanations that can be provided to decision subjects who receive unfavorable decisions. In response, decision subjects follow the given explanation to improve their future outcomes. While finding the optimal set of explanations is NP-hard, the resulting objective is shown to be submodular, which allows for efficient approximation. The paper establishes an interesting connection between strategic behavior and explainability.
Decisions, Counterfactual Explanations and Strategic Behavior
As data-driven predictive models are increasingly used to inform decisions, it has been argued that decision makers should provide explanations that help individuals understand what would have to change for these decisions to be beneficial ones. However, there has been little discussion of the possibility that individuals may use such counterfactual explanations to invest effort strategically and maximize their chances of receiving a beneficial decision. In this paper, our goal is to find policies and counterfactual explanations that are optimal in terms of utility in such a strategic setting. We first show that, given a pre-defined policy, the problem of finding the optimal set of counterfactual explanations is NP-hard. Then, we show that the corresponding objective is nondecreasing and submodular, which allows a standard greedy algorithm to enjoy approximation guarantees.
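The greedy procedure mentioned in the abstract can be sketched as follows. This is a generic greedy routine for maximizing a nondecreasing submodular set function under a size limit, not the authors' implementation. The names `greedy_select`, `coverage`, and the toy `covers` table are hypothetical, and the coverage objective (how many individuals at least one chosen explanation reaches) merely stands in for the paper's utility. For objectives of this kind, the classical result is that greedy selection achieves at least a (1 - 1/e) fraction of the optimal value.

```python
from typing import Callable, FrozenSet, List, Set


def greedy_select(
    candidates: List[str],
    k: int,
    utility: Callable[[FrozenSet[str]], float],
) -> Set[str]:
    """Pick up to k candidates, each time adding the one with the largest marginal gain."""
    selected: Set[str] = set()
    remaining = list(candidates)  # a list keeps tie-breaking deterministic
    for _ in range(k):
        base = utility(frozenset(selected))
        best, best_gain = None, 0.0
        for c in remaining:
            gain = utility(frozenset(selected | {c})) - base
            if gain > best_gain:
                best, best_gain = c, gain
        if best is None:  # no candidate adds value, so stop early
            break
        selected.add(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: which individuals each candidate explanation can reach.
covers = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}, "D": {1}}


def coverage(chosen: FrozenSet[str]) -> float:
    """Number of individuals reached by at least one chosen explanation."""
    reached = set()
    for name in chosen:
        reached |= covers[name]
    return float(len(reached))


chosen = greedy_select(list(covers), k=2, utility=coverage)
print(sorted(chosen), coverage(frozenset(chosen)))
```

Each round re-evaluates the objective once per remaining candidate, so the sketch makes on the order of k times n utility evaluations for n candidates, which is what makes the approach tractable despite the NP-hardness of the exact problem.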